FSNet: An Identity-Aware Generative Model for Image-based Face Swapping
This paper presents FSNet, a deep generative model for image-based face
swapping. Traditionally, face-swapping methods are based on three-dimensional
morphable models (3DMMs), and facial textures are replaced between the
estimated three-dimensional (3D) geometries in two images of different
individuals. However, the estimation of 3D geometries along with different
lighting conditions using 3DMMs remains a difficult task. Instead of facial
textures, we represent the face region with a latent variable computed by the
proposed deep neural network (DNN). The DNN synthesizes a face-swapped image
from the latent variable of the face region and an image of the non-face
region. The proposed method requires no 3DMM fitting; it performs face
swapping simply by feeding two face images to the network. Consequently, our
DNN-based face swapping handles challenging inputs with different face
orientations and lighting conditions better than previous approaches. Through
several experiments, we demonstrate that the proposed method performs face
swapping more stably than the state-of-the-art method and that its results
are comparable with those of that method.
Comment: 20 pages, Asian Conference on Computer Vision 201
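The core idea above is a decomposition: an encoder maps the source face region to a latent variable, and a generator combines that latent with the non-face region of the target image to synthesize the swapped result. The sketch below illustrates this decomposition in PyTorch; the module names, layer layout, and tensor sizes are assumptions made for illustration and do not reproduce FSNet's actual architecture or training losses.

```python
# Minimal sketch of the identity-latent face-swapping idea (assumed layout, not FSNet's design).
import torch
import torch.nn as nn

class FaceEncoder(nn.Module):
    """Maps a cropped source face to a compact identity latent."""
    def __init__(self, latent_dim=256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(128, latent_dim)

    def forward(self, face):
        return self.fc(self.conv(face).flatten(1))

class SwapGenerator(nn.Module):
    """Synthesizes a face-swapped image from the identity latent and the
    non-face region of the target image (face area masked out)."""
    def __init__(self, latent_dim=256):
        super().__init__()
        self.embed = nn.Conv2d(3 + latent_dim, 64, 3, padding=1)
        self.body = nn.Sequential(
            nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, z, non_face):
        b, _, h, w = non_face.shape
        # Broadcast the identity latent over the spatial grid and fuse it with the background.
        z_map = z.view(b, -1, 1, 1).expand(b, z.shape[1], h, w)
        return self.body(self.embed(torch.cat([non_face, z_map], dim=1)))

# Usage: swap the identity of `source` into the scene of `target_background`.
enc, gen = FaceEncoder(), SwapGenerator()
source = torch.rand(1, 3, 128, 128)             # cropped source face
target_background = torch.rand(1, 3, 128, 128)  # target image with face region masked
swapped = gen(enc(source), target_background)   # (1, 3, 128, 128)
```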
Video face replacement
We present a method for replacing facial performances in video. Our approach accounts for differences in identity, visual appearance, speech, and timing between source and target videos. Unlike prior work, it does not require substantial manual operation or complex acquisition hardware, only single-camera video. We use a 3D multilinear model to track the facial performance in both videos. Using the corresponding 3D geometry, we warp the source to the target face and retime the source to match the target performance. We then compute an optimal seam through the video volume that maintains temporal consistency in the final composite. We showcase the use of our method on a variety of examples and present the result of a user study that suggests our results are difficult to distinguish from real video footage.
Funding: National Science Foundation (U.S.) (Grant PHY-0835713); National Science Foundation (U.S.) (Grant DMS-0739255)
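The abstract's key compositing step is the optimal seam that separates the warped source face from the target frame. The toy sketch below makes the seam idea concrete with a simple 2D dynamic program over a per-frame cost map; it is a deliberate simplification, not the paper's spatio-temporal seam through the full video volume, and all array shapes are assumptions.

```python
# Toy illustration of the "optimal seam" idea: find a low-cost vertical seam
# through a cost map (e.g., per-pixel difference between warped source and
# target frames), then composite along it. This is plain seam-carving-style
# dynamic programming, not the paper's video-volume formulation.
import numpy as np

def vertical_seam(cost):
    """Return one column index per row forming a minimum-cost vertical seam."""
    h, w = cost.shape
    acc = cost.astype(np.float64).copy()
    for y in range(1, h):
        left = np.concatenate(([np.inf], acc[y - 1, :-1]))
        right = np.concatenate((acc[y - 1, 1:], [np.inf]))
        acc[y] += np.minimum(np.minimum(left, acc[y - 1]), right)
    seam = np.empty(h, dtype=np.int64)
    seam[-1] = int(np.argmin(acc[-1]))
    for y in range(h - 2, -1, -1):
        x = seam[y + 1]
        lo, hi = max(0, x - 1), min(w, x + 2)
        seam[y] = lo + int(np.argmin(acc[y, lo:hi]))
    return seam

# Example: composite a warped source face over a target frame along the seam.
target = np.random.rand(120, 160, 3)
warped_source = np.random.rand(120, 160, 3)
cost = np.abs(target - warped_source).sum(axis=2)   # low cost where the frames agree
seam = vertical_seam(cost)
composite = target.copy()
for y, x in enumerate(seam):
    composite[y, :x] = warped_source[y, :x]          # source pixels left of the seam
```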
3D Face Reconstruction from Light Field Images: A Model-free Approach
Reconstructing 3D facial geometry from a single RGB image has recently
attracted wide research interest. However, it remains an ill-posed problem,
and most methods rely on prior models, which undermines the accuracy of the
recovered 3D faces. In this paper, we exploit the Epipolar Plane Images (EPI)
obtained from light field cameras and learn CNN models that recover horizontal
and vertical 3D facial curves from the respective horizontal and vertical EPIs.
Our 3D face reconstruction network (FaceLFnet) comprises a densely connected
architecture to learn accurate 3D facial curves from low resolution EPIs. To
train the proposed FaceLFnets from scratch, we synthesize photo-realistic light
field images from 3D facial scans. The curve-by-curve 3D face estimation
approach allows the networks to learn from only 14K images of 80 identities,
which still comprise over 11 million EPIs/curves. The estimated facial curves
are merged into a single point cloud to which a surface is fitted to obtain the
final 3D face. Our method is model-free, requires only a few training samples
to learn FaceLFnet and can reconstruct 3D faces with high accuracy from single
light field images under varying poses, expressions and lighting conditions.
Comparisons on the BU-3DFE and BU-4DFE datasets show that our method reduces
reconstruction errors by over 20% compared to the recent state of the art.
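The pipeline described above (EPIs in, per-row depth curves out, curves merged into a point cloud, surface fitted) can be made concrete with a toy regressor. The sketch below is a minimal illustration under assumed shapes and a made-up module name (EPICurveNet); it does not reproduce the paper's densely connected FaceLFnet architecture or its training on synthesized light fields.

```python
# Illustrative sketch of the curve-by-curve idea: a small CNN regresses one 3D
# facial curve (a depth value per pixel along an image row) from the
# corresponding Epipolar Plane Image (EPI); predicted curves are stacked into a
# point cloud. Layer sizes and names are assumptions, not the paper's design.
import torch
import torch.nn as nn

class EPICurveNet(nn.Module):
    """Maps one EPI of shape (views, samples) to a depth curve of length `samples`."""
    def __init__(self, n_views=9, samples=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )
        self.head = nn.Linear(32 * n_views * samples, samples)

    def forward(self, epi):                     # epi: (batch, 1, views, samples)
        return self.head(self.features(epi).flatten(1))  # (batch, samples) depths

# Predict one depth curve per horizontal EPI and stack the curves into a point cloud.
net = EPICurveNet()
n_rows, samples = 64, 128
epis = torch.rand(n_rows, 1, 9, samples)        # one EPI per image row
with torch.no_grad():
    depths = net(epis)                          # (n_rows, samples)
ys, xs = torch.meshgrid(torch.arange(n_rows), torch.arange(samples), indexing="ij")
point_cloud = torch.stack([xs.float(), ys.float(), depths], dim=-1).reshape(-1, 3)
# A surface (e.g., via Poisson reconstruction) would then be fitted to this cloud.
```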